Abstract: Consider a game where Alice generates an integer and Bob wins if he can factor that integer. Traditional
game theory tells us that Bob will always win this game even though in practice Alice will win given our
usual assumptions about the hardness of factoring.

We define a new notion of bounded rationality, where the payoffs of players are discounted by the computation 
time they take to produce their actions. We use this notion to give a direct correspondence between the existence of 
equilibria where Alice has a winning strategy and the hardness of factoring. Namely, under a natural assumption 
on the discount rates, there is an equilibrium where Alice has a winning strategy iff there is a linear-time samplable 
distribution with respect to which Factoring is hard on average. 

We also give general results for discounted games over countable action spaces, including showing that any game 
with bounded and computable payoffs has an equilibrium in our model, even if each player is allowed a countable 
number of actions. It follows, for example, that the Largest Integer game has an equilibrium in our model though 
it has no Nash equilibria or $\epsilon$-Nash equilibria.
1 Introduction

Game theory studies the strategic behavior of self- 
interested rational agents when they interact. In the 
traditional setting of game theory, agents are supposed 
to be perfectly rational, in terms of knowing what 
their strategic options and the consequences of choos- 
ing these options are, as well as being able to model per- 
fectly the rationality of other agents with whom they in- 
teract. However, often in practice, when human beings 
are involved in a strategic game-playing situation, they 
fail to make perfectly rational decisions. Herbert Simon 
first developed this "bounded rationality" perspective. 

In the past couple of decades various models of
bounded rationality have been defined
and studied by game theorists and computer scientists.
In this paper, we introduce a new notion of bounded 
rationality based on the perspective of computational 
complexity. We argue that it is natural, and prove that 
it has some nice properties and can be used to obtain 
new connections between game theory and computa- 
tional complexity. 

The main idea is to discount the payoffs of players 
in a game based on how much time they take to play 
their actions, with different players possibly discounted 
at different rates. Of course, we need to define what it 
means for a player to take time to play its action. This
naturally presupposes that each player has some computational
mechanism for playing its strategy - in this paper, as in the
recent work by Halpern and Pass [7], we adopt the probabilistic
Turing machine as our computational model. This computational
model is universal and is generally considered to be realizable
in Nature. Furthermore, it captures complexity via running time
and can be used to realize games with countable action spaces,
unlike, say, finite automata, the model typically considered by
game theorists when studying bounded rationality.

In this paper, we use exponential discounting, meaning
that the payoff goes down by a factor $(1-\delta)^t$ after
time $t$, where $\delta$ is a constant. Our main results also hold
for other notions of discounting, as we discuss in Section 1.2.

The notion of discounting is far from new - indeed,
much of economic theory depends on it. It is
a basic economic assumption that people value a dollar
a year from now less than a dollar today. The discount
$1-\delta$ for a specific period is chosen so that an agent is
indifferent between receiving $1-\delta$ dollars now and 1
dollar at the end of the period.

Discounting is commonly used for computing cumu- 
lative payoffs in repeated games. We emphasize that 
we discount based on computation time, which means 
that the notion of discounting can now even be used for 
one-shot games even when there is no natural notion 
of input size. The idea of discounting based on computation
time was developed by Fortnow [11], who used it for a
variation on the "program equilibria" framework developed
by Tennenholtz [12]; in that work, moreover, a single
discount rate is used for all players.

Our notion of discounted time has several benefits. 
First, it bounds rationality endogenously rather than
exogenously. By this, we mean that the bound on an
agent's rationality is not imposed from outside, but 
rather arises from the agent's own need to maximize its 
utility. 

Second, discounting has some nice mathematical 
properties. It is time-independent - discounting for $r$
steps starting at a time $t_0$ yields the same relative
decrease in payoff as discounting for $r$ steps starting at
an earlier or later time. Given a discount factor $1-\delta$,
the discounted payoff behaves like a linear function for
small $t$ and like an exponential function for large $t$,
which accords well with our intuitions for how we value 
computational resources in the real world. We might 
only be marginally more gratified by a computational 
task finishing in 1 second than one finishing in 2 sec- 
onds, but we would certainly be far more annoyed if a 
task finished in 20 minutes than in 10 minutes. 
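
Both properties can be checked numerically. The following is a minimal sketch rather than part of the formal model; the function name `discount` and the sample values of the discount rate and of the step counts are illustrative assumptions.

```python
import math

def discount(delta: float, t: int) -> float:
    """Fraction of the payoff retained after t steps at discount rate delta."""
    return (1.0 - delta) ** t

delta = 1e-3  # illustrative discount rate (an assumption, not from the paper)

# Time-independence: waiting r more steps costs the same relative decrease
# whether we start at time 10 or at time 5000.
r = 50
ratio_early = discount(delta, 10 + r) / discount(delta, 10)
ratio_late = discount(delta, 5000 + r) / discount(delta, 5000)

# Small t (t much less than 1/delta): close to the linear function 1 - delta*t.
linear_gap = abs(discount(delta, 10) - (1 - delta * 10))

# Large t (t comparable to or larger than 1/delta): close to exp(-delta*t).
exp_gap = abs(discount(delta, 5000) - math.exp(-delta * 5000))
```

Here `ratio_early` and `ratio_late` agree up to floating-point error, which is the time-independence property, while the two gaps show which approximation is accurate in which regime.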

Also, the discounting model is philosophically ele- 
gant in that it unifies time as viewed by economists and 
time as viewed by computer scientists. Time is an im- 
portant concept both in economics and in computational 
complexity, and we model it in a way that is consistent 
with the perspectives of both fields. 

We use asymmetric discounting in our model - different
players may have different discount factors. There
are a couple of reasons for this. First, players might
have asymmetric roles in a game, and in this case it
is natural to give them different discount factors. For example,
a cryptographic protocol can be interpreted as a game
where players are either honest or adversarial. In this
setting, it makes sense to model the adversary as more
patient and therefore having a discount factor $1-\delta$ closer to 1.

However, even if all the players are equally patient 
with respect to real time, it still makes sense to give 
them different discount factors. This is because dis- 
counting is done as a function of computational time 
rather than real time, and the relationship between com- 
putational time and real time depends on the power of 
technology. If one player has a much faster computer
than the others, then it is effectively more patient, in
that it has a smaller discount rate. For example, consider
a two-player game where the players are equally
patient in that the payoff for each player halves after 1
second of real time. Suppose, however, that Player 1
has a computer with a clock rate of $10^6$ operations per
second, and Player 2 has a computer with a clock rate
of $10^{12}$ operations per second. Then the discount rate
$\delta_1$ for Player 1 is approximately $10^{-6}$ and the discount
rate $\delta_2$ for Player 2 is approximately $10^{-12}$.
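
The arithmetic behind this example can be sketched as follows. We assume, for illustration only, that one operation corresponds to one step of discounted time; the helper `discount_rate` is a hypothetical name. Solving $(1-\delta)^T = 1/2$ for $T$ steps per halving period gives $\delta = 1 - 2^{-1/T}$, which is about $(\ln 2)/T$, so the rates agree with the figures in the text up to a constant factor of roughly 0.69.

```python
import math

def discount_rate(ops_per_second: float, halving_time_seconds: float = 1.0) -> float:
    """Per-step discount rate delta such that (1 - delta)^T = 1/2,
    where T is the number of steps executed in halving_time_seconds."""
    steps = ops_per_second * halving_time_seconds
    # 1 - 2^(-1/steps), computed with expm1 to avoid cancellation for large steps.
    return -math.expm1(-math.log(2.0) / steps)

delta_1 = discount_rate(1e6)    # Player 1: 10^6 operations per second
delta_2 = discount_rate(1e12)   # Player 2: 10^12 operations per second
```

The ratio `delta_1 / delta_2` is close to $10^6$, matching the ratio of the two clock rates.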

This is a further advantage of our model, in that it 
factors in the power of technology. Many games today
play out in a virtual setting, e.g., the game between
someone sending their credit card information and a 
malicious adversary seeking to steal their identity, or an 
electronic auction, or even computer chess. In all these 
cases, the power of technology has a critical impact on
strategy and success in the game, which is not modeled
adequately by traditional game theory. Not only do we 
model this via the discount rates, but our notion of uni- 
form equilibrium also implicitly models how technol- 
ogy evolves with time. 

Our model exhibits some nice phenomena for gen- 
eral classes of games. We define a new notion of equi- 
librium for our model, which we call "uniform equilib- 
rium". We show that for finite games, there's a uniform 
equilibrium corresponding to every Nash equilibrium. 
For games where each player has a countable action 
space, the situation is even more interesting. It's known 
that Nash equilibria do not exist in general in this case. 
However, under mild assumptions, namely that the pay- 
offs are bounded and computable, we show that uniform 
equilibria always exist even in this case. 

As an example, consider the Largest Integer game, 
where each player outputs a number and the player out- 
putting the largest number wins the entire pot of money 
at stake (with the players sharing the pot equally if they 
output the same number). This is an archetypal example 
of a game which has no Nash equilibria or even approx- 
imate Nash equilibria. The absence of Nash equilibria 
means that traditional game theory provides no predic- 
tive or explanatory framework for how the game will 
actually play out. 

The Largest Integer game does have a uniform equi- 
librium in our framework, and there is an intuitive ex- 
planation of this. Essentially, the Largest Integer game 
models one-upmanship, where each player is trying to
outdo the other. What is not modeled by traditional 
game theory is that the players expend considerable re- 
sources in this process, which affects their "effective 
payoff". Indeed, as more and more resources are re- 
quired, at some point the players become essentially in- 
different between winning and losing. In our case, the 
resource is time; the equilibrium situation corresponds 
to both players spending so much time coming up with 
and writing down a large number that their payoffs are 
driven to zero by their discount factors. 
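
The intuition can be illustrated with a small calculation. This is only a sketch under a simplifying assumption of ours, namely that a machine needs at least $k$ steps to write down a $k$-bit number; it is not the equilibrium construction itself, and the function name and sample values are hypothetical.

```python
def discounted_win_payoff(num_bits: int, eps: float, pot: float = 1.0) -> float:
    """Upper bound on the discounted payoff of a player who wins the pot
    after spending at least num_bits steps writing down its number."""
    return pot * (1.0 - eps) ** num_bits

eps = 1e-3  # illustrative discount rate (an assumption)

# Winning with a longer number is worth less and less after discounting.
payoffs = [discounted_win_payoff(k, eps) for k in (10, 1_000, 100_000)]
```

As the number of bits grows, even the winner's discounted payoff approaches zero, so the difference between winning and losing vanishes.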

1.1 The Factoring Game 

Perhaps the most interesting results in this paper 
concern a close relationship between equilibria in dis- 
counted games and the computational complexity of 
problems. We illustrate this using the Factoring game. 

The Factoring game is a puzzle in the theory of 
bounded rationality. Consider the following game between
two players Alice and Bob. Alice sends an integer
$n \geq 2$ to Bob, who attempts to find its prime factorization.
If Bob succeeds, he "wins" - he gets a large
payoff and Alice gets a small payoff; if he fails, the op- 
posite happens. 

If formulated as a game in the conventional way, Bob 
always has a winning strategy. However, in practice, 
one would expect Alice to win, since factoring is believed
to be computationally hard. This is the puzzle:
to find a natural formulation of the game that captures 
the intuition that Alice should win if factoring is indeed 
computationally hard. 

The Factoring game was first introduced by Ben-Sasson,
Kalai and Kalai [13] and also considered by
Halpern and Pass [7]. Neither gives an explicit solution
to the puzzle; instead, they give general frameworks
in which to study games with computational costs.
Indeed, Ben-Sasson, Kalai and Kalai say in the Future
Work section of their paper that "it would be interest- 
ing to make connections between asymptotic algorith- 
mic complexity and games". 

We show that the structure of equilibrium payoffs in 
the discounted time version of the game corresponds 
closely to the computational complexity of factoring. 
Specifically, if Factoring is in probabilistic polynomial 
time on average, Bob always wins; if not, there are equi- 
libria in which Alice gets a large payoff. This result 
assumes that the discount rates of the two players are 
polynomially related - we motivate this assumption in 
Section 4. If there's a different relationship between
the discount rates, then there's a corresponding different 
complexity assumption which characterizes when Alice 
has a winning strategy. In the simplest interpretation of 
our model, where discount rates are determined by the 
power of technology, it can be empirically tested how 
discount rates vary with each other. 

What makes this connection with asymptotic com- 
plexity somewhat surprising is that the notion of input 
length is not explicitly present in our model. Instead, it 
arises naturally from the discounting criterion and our 
notion of uniform equilibrium. 

The Factoring game is relevant not only to game the- 
ory, but also to the foundations of cryptography. There 
has been a lot of research into the connections between 
game theory and cryptography [14, 15], but much of
this has focused on multi-party computation. One can 
define an analogue of the Factoring game for any one- 
way function and obtain similar results; there's nothing 
special about Factoring being used in the proofs. This 
game-theoretic perspective might be useful in studying 
the tradeoff between efficiency of encryption and secu- 
rity in cryptosystems. In general, it would be interest- 
ing to investigate a perspective where the success of a 
cryptosystem depends on the adversary being "bounded 
rational" rather than computationally bounded in some 
specific sense. 

1.2 Further Discussion of the Model 

Here, we further discuss various features of our 
model and compare it to alternative ones. 

Our criteria for a reasonable model are that it should
be general, i.e., be relevant to a class of situations rather
than a single specific situation, and that it should have
explanatory power, i.e., it should not simply correspond
to an observed phenomenon but should provide some
further insight. For comparative purposes, in the context 
of the Factoring game, one can think of some alterna- 
tive models that predict a win for Alice. For example, 
one could imagine that the players have a fixed finite 
amount of time to make a decision, with Alice given 
say 10 seconds to choose her number, and Bob 100 seconds
to respond with the prime factors. It's clear that
if Bob can't factor a random large number (which could
be generated quickly by Alice), he loses. However, this is
an unsatisfactory model in many respects. First, it deals
with a very specific situation, so it cannot say anything 
about computational complexity or how equilibria de- 
pend on the power of technology. Second, the model 
is inherently non-robust. Bob might be able to factor 
Alice's number in 101 seconds - in a real-life situation, 
this difference shouldn't affect his payoff too much, but 
in this model, it does. By adopting a flexible model 
of bounded rationality, where payoffs degrade contin- 
uously with time, we avoid such pathological effects. 

One way to make the fixed-time model more general
is to quantify over the time limit: to say, for example, that
if Alice is allowed $t$ units of time, then Bob is allowed
$t^2$ units of time. This kind of approach is taken when
formulating the notion of "computational equilibrium"
[14, 16], where the set of machines being used is limited
to those whose running time is bounded in terms of some
security parameter, whereas our model makes no such
restriction on machines but controls time through utility.
Another problem with the computational
equilibrium model is that though it might be
consistent with the observed phenomenon, it's unclear 
why the assumptions the model makes should hold. In 
such a case, the model is simply a way to re-formulate 
a phenomenon, rather than an explanation for it. In 
contrast, in our model, there are clear motivations for 
the choices made. Discounting is based on time prefer- 
ence of utility, which is well established and extensively 
used in economics [10]. Also, our interpretation of discount
rates in terms of the power of current technology
matches the intuition that a player armed with a more 
powerful computer should be able to make a more ratio- 
nal decision, i.e., more in its self-interest. Finally, our 
use of asymmetric discount rates models asymmetries 
in the roles of players and in the power of technology 
available to them. 

Regarding some of the more specific choices made, 
one could question why we use exponential discount- 
ing rather than some other form of discounting. Ex- 
ponential discounting is still the discounting model of 
choice in economics and game theory, but there have 
been arguments made that other models such as "hy- 
perbolic discounting" more accurately represent human 
time preference of utility [10]. As it turns out, the exact
choice of discounting model does not matter very much
to us - our main results on the Factoring game and the
general result on bounded-payoff games (including Theorems 3
and 7) go through even in the hyperbolic discounting
model and, we suspect, in any reasonable model of
discounting.

Another issue which can be debated is whether each 
player's utility is discounted only by its own compu- 
tational time or by some function of its computational 
time and the computational time of the other players. 
In a strategic situation, it seems natural to penalize a 
player only for its own computation. Consider a two- 
player simultaneous-move game, where each player 
plays without knowledge of the other player's action. 
Suppose Player 1 plays first in this game. Should Player 
1 be charged for Player 2's time as well, since the out- 
come is determined only after Player 2 has played? We 
think not, since Player 1 can use its extra time doing 
other things, garnering utility in other ways. Of course, 
if one player plays first, that might seem to "sequential- 
ize" the game. For our model to apply, there has to be a 
mechanism in place to ensure that the players do in fact 
act independently. 

Our model can, in principle, deal with both positive 
and negative payoffs - discounting predicts, as seems 
intuitive, that positive payoffs should motivate agents 
to play quickly, while negative payoffs should cause 
agents to procrastinate. However, in this paper, we deal 
only with games with positive payoffs. This is because 
it's tricky to define what happens if the first player's 
computation finishes within a finite time but the sec- 
ond player's strategy computation never halts. In some 
sense, this corresponds to the second player not play- 
ing the game at all. With strictly positive payoffs, we 
can be guaranteed that in an equilibrium situation, all 
players will play within a finite time - it is in the inter- 
est of all players to play as quickly as possible. A way 
to avoid the issue with positivity of payoffs would be 
to give players preference orderings on outcomes rather 
than ascribing real payoffs, as is often done in game
theory [17], and have the preference orderings vary with
computational time. Though perhaps a more accurate 
model, this has the disadvantage of being very cumber- 
some mathematically. 

1.3 Related Work 

Bounded rationality is a rich area, with lots of work 
in the past couple of decades. We survey some of that 
work and clarify the relationship to our ideas, with an 
emphasis on more recent work. There are several excellent
surveys and references on bounded rationality.

Early work focused mainly on bounded rationality
in the context of the repeated Prisoner's Dilemma
game, where strategies are modeled as finite automata.
There were some works during this
period which modeled strategies by Turing machines,
but these works were concerned with Turing
machine size as a complexity measure rather than time.



There has also been a good deal of work in the eco- 
nomics literature studying the consequences for eco- 
nomics of the constraint that agents act in computable 
ways [1, 23, 24], but these works do not deal with
computational complexity.

Recently there has been a resurgence of interest in 
modeling strategies as general Turing machines. We 
note especially the two papers [7, 13] which discuss the
Factoring game. Rather than specifying an explicit so- 
lution to the puzzle of the Factoring game, these works 
provide general frameworks and results for taking com- 
putational costs into account when playing games. Our 
contribution in this paper is in providing a concrete and 
natural model which captures the cost of computational 
time, and using it to solve the Factoring puzzle. 



Other recent works [11, 12] consider computer programs
as strategies, but in the context of a different
kind of equilibrium known as the program equilibrium,
where rationality is modeled by letting each player's
program have as input the code of the other player's
program. As mentioned earlier, Fortnow [11] considers
discounted computation time in this context to obtain
a broader range of program equilibria rather than
to model bounded rationality, and he allows only for a 
single discount factor. 

The idea of discounting time has also been proposed 
in the completely different context of verification [25]. 



2 Preliminaries 

We review standard concepts for two-player games. 
For a more detailed treatment, refer to the books by 
Osborne and Rubinstein [17] and Leyton-Brown and
Shoham.

In this paper, we only consider one-shot games of 
perfect information, where each player makes a single 
move. We represent these games in normal form as a
four-tuple $G = (A_1, A_2, u_1, u_2)$, where $A_i$ is the action
space for player $i$. The utility function
$u_i : A_1 \times A_2 \to \mathbb{R}^{\geq 0}$ is a payoff function
specifying the payoff that accrues to player $i$ depending on
the actions played by the two players. We consider both the
simultaneous version where both players play their actions
simultaneously and the sequential version where player 2 can
base his action on the action taken by player 1.

As mentioned before, we assume in this paper that 
payoff functions are always non-negative. 

Strategies describe how the players choose their actions.
A pure strategy for Player 1 is simply an element
of $A_1$. For simultaneous-move games, a pure strategy
for player 2 is just an element of $A_2$. For sequential
games, a pure strategy for player 2 is a function from
$A_1$ into $A_2$. We use $S_i$ to represent the pure strategy
space for player $i$ and we extend the utility functions $u_i$
to strategies in the natural way.

A mixed strategy for a player is a probability distribution
over its pure strategies. The payoff for a game
using the mixed strategies is just the expected payoff
when each player chooses their strategies independently
from their chosen distributions.

A pure-strategy Nash equilibrium (NE) is a pair of
strategies $(s_1, s_2) \in S_1 \times S_2$ such that for any
$s_1^* \in S_1$ and $s_2^* \in S_2$,
$u_1(s_1, s_2) \geq u_1(s_1^*, s_2)$ and
$u_2(s_1, s_2) \geq u_2(s_1, s_2^*)$. A pair of strategies is an
$\eta$-NE if neither player can increase its payoff by more
than $\eta$ by playing a different strategy, given that their
opponent plays the same strategy as before. For small $\eta$,
the players might be satisfied with an $\eta$-NE rather than a
pure NE, since they might be indifferent to small changes in their
payoff function.

A mixed-strategy Nash equilibrium is a pair of mixed
strategies for which neither player can increase their
expected payoff by playing a different mixed strategy,
assuming that their opponent plays the same mixed strategy
as before. The notion of an $\eta$-NE for mixed strategies
is defined in an analogous way to the definition for
pure strategies.

The famous theorem of Nash states that every
game over compact action spaces has a mixed-strategy 
Nash equilibrium. When we say "Nash equilibrium" in 
this paper, we mean a mixed-strategy Nash equilibrium 
unless otherwise stated. 

3 Our Model 

The normal-form representation of a game does not 
say anything about how a strategy is actually imple- 
mented by a player. Depending on the method of imple- 
mentation used, there might be further costs incurred - 
the analysis of these costs may itself be game-theoretic. 
This insight is formalized by the notion of a metagame. 
Given a game G, the metagame is a new game which 
augments G by modeling outside factors which are rel- 
evant to playing G. Thus a metagame aims to be a 
more accurate model of how G might play out in the 
real world. 

We consider the machine metagame, which presumes 
that a strategy is implemented by some computational 
process. We model the computational process as a prob- 
abilistic Turing machine, which is a very general model 
of computation. By the Church-Turing thesis, proba- 
bilistic Turing machines can compute any function that 
is effectively computable. The motivation for consider- 
ing probabilistic machines is the idea that randomness 
is also a resource available in the real world. 

In the machine metagame corresponding to a game
$G = (A_1, A_2, u_1, u_2)$, actions for Player $i$ are
probabilistic Turing machines rather than elements of $A_i$.
Since we only consider countable strategy sets, for each
$i$ the elements of $A_i$ may be represented by binary
strings in some canonical way, with each string representing
a strategy and each strategy represented by a
string. If a probabilistic TM played by Player 1 outputs
a string $x$ with probability $p(x)$, this is interpreted as
Player 1 playing a strategy $x$ with probability $p(x)$ in
the game $G$.

Now that strategies are Turing machines, computa- 
tional issues can be factored into the game, even though 
for a fixed game, there is no natural notion of an "in- 
put size." We address this issue by discounting each 
player's payoff by the time taken to produce a (repre- 
sentation of a) strategy. The discount factors for the 
two players might be different, reflecting the possibili- 
ties that the game is asymmetric between the two play- 
ers, and that the two players have differing amounts of 
computational resources. 

Given a game $G = (A_1, A_2, u_1, u_2)$, we formally
define the $(\epsilon, \delta)$-discounted version of $G$. This is the
discounted time machine metagame corresponding to
$G$, where the players' computation times are discounted
by $1-\epsilon$ and $1-\delta$ respectively. In this game, each
player's action space is the class of all probabilistic Turing
machines. Each player's Turing machine gets as input
$\lceil 1/\epsilon \rceil$ and $\lceil 1/\delta \rceil$ in binary - this
corresponds to the players having full information about the game.
If the game is extensive, Player 2's Turing machine gets as
additional input the output of Player 1's Turing machine.

We formally specify how payoffs are determined. We
first consider the case where both players' Turing machines
halt on all computation paths. Given a computation
path $z$ of a probabilistic TM, let $t(z)$ denote the
length of the computation path (i.e., the time taken by
the computation), $f_1(z) \in A_1$ the action in $A_1$
corresponding to the output of the path $z$, and $f_2(z) \in A_2$
the action in $A_2$ corresponding to the output of the
path $z$. Then the payoff $u_1(M, N)$ of Player 1
corresponding to Player 1 playing a probabilistic Turing
machine $M$ and Player 2 playing $N$ is the expectation
over computation paths $z$ and $w$ of $M$ and $N$
respectively of $(1-\epsilon)^{t(z)} u_1(f_1(z), f_2(w))$. Similarly,
the payoff $u_2(M, N)$ of Player 2 is the expectation of
$(1-\delta)^{t(w)} u_2(f_1(z), f_2(w))$.
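
For machines with finitely many computation paths, this definition can be evaluated directly. The sketch below is illustrative only: it covers the simultaneous-move case, summarizes each halting machine by a distribution over (running time, output action) pairs, and uses names (`PathDist`, `discounted_payoffs`) and toy utilities that are our own assumptions.

```python
from typing import Callable, Dict, Hashable, Tuple

# A halting probabilistic machine is summarized here by a distribution over
# (running time, output action) pairs; this encoding is an illustrative assumption.
PathDist = Dict[Tuple[int, Hashable], float]

def discounted_payoffs(
    m_paths: PathDist,
    n_paths: PathDist,
    u1: Callable[[Hashable, Hashable], float],
    u2: Callable[[Hashable, Hashable], float],
    eps: float,
    delta: float,
) -> Tuple[float, float]:
    """Expected payoffs when each player is discounted by its own running time."""
    p1 = p2 = 0.0
    for (t_z, a1), pz in m_paths.items():
        for (t_w, a2), pw in n_paths.items():
            prob = pz * pw  # the two machines use independent randomness
            p1 += prob * (1.0 - eps) ** t_z * u1(a1, a2)
            p2 += prob * (1.0 - delta) ** t_w * u2(a1, a2)
    return p1, p2

# Toy usage: Player 1 always outputs "a" after 3 steps; Player 2 outputs
# "x" after 2 steps or "y" after 4 steps with equal probability.
u1 = lambda a, b: 2.0 if b == "x" else 1.0
u2 = lambda a, b: 1.0
m = {(3, "a"): 1.0}
n = {(2, "x"): 0.5, (4, "y"): 0.5}
p1, p2 = discounted_payoffs(m, n, u1, u2, eps=0.1, delta=0.5)
```

Note that Player 1's payoff depends on Player 2's output but not on Player 2's running time, reflecting the choice to charge each player only for its own computation.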

In addition, we require a convention for payoffs on
non-halting paths. In this case, a player whose machine
does not halt gets payoff 0 (corresponding to discounting
for infinite time), and if the other player's machine
does halt, that player gets the maximum possible payoff
over all actions in $A_1$ of playing its action, discounted
by the computation time of playing its action.

We define two new equilibrium concepts, which correspond
to equilibria that are robust when the discount
rates $\epsilon$ and $\delta$ tend to zero. Our motivation for being
interested in this limiting case is that computational costs
grow smaller and smaller with time (or equivalently,
computational power increases with time) - this corresponds
to $\epsilon$ and $\delta$ approaching 0.

We say that a pair of probabilistic machines $(M, N)$
is a uniform Nash equilibrium (NE) if for every pair of
machines $(M', N')$,

$$\liminf_{\epsilon, \delta \to 0} \; u_1(M, N) - u_1(M', N) \geq 0$$

and

$$\liminf_{\epsilon, \delta \to 0} \; u_2(M, N) - u_2(M, N') \geq 0.$$

We say that $(M, N)$ is a strong uniform NE of the
discounted game if there is a function $f$ such that $(M, N)$
is an $f(\epsilon, \delta)$-NE for the $(\epsilon, \delta)$-discounted game,
where $f(\epsilon, \delta)$ tends to 0 when both $\epsilon$
and $\delta$ tend to 0.

As the name indicates, the notion of a strong uniform
NE is a stronger concept since it requires a fixed
equilibrium strategy pair to be resilient in the limit against
deviating strategies which might depend on $\epsilon$ and $\delta$. In
contrast, a uniform NE is only required to be resilient in
the limit against other fixed strategies.
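
To make the contrast concrete, the two conditions can be written side by side. This is only a restatement of the two definitions in a single display; the explicit quantifier placement is our reading of the text, with all payoffs evaluated in the $(\epsilon, \delta)$-discounted game.

```latex
% Uniform NE: the deviations M', N' are fixed before the limit is taken.
\forall M', N':\quad
\liminf_{\epsilon,\delta \to 0} \bigl[ u_1(M,N) - u_1(M',N) \bigr] \ge 0,
\qquad
\liminf_{\epsilon,\delta \to 0} \bigl[ u_2(M,N) - u_2(M,N') \bigr] \ge 0.

% Strong uniform NE: for each (\epsilon,\delta), no deviation, even one chosen
% with knowledge of \epsilon and \delta, gains more than f(\epsilon,\delta).
\forall \epsilon, \delta \;\; \forall M', N':\quad
u_1(M',N) - u_1(M,N) \le f(\epsilon,\delta),
\qquad
u_2(M,N') - u_2(M,N) \le f(\epsilon,\delta),
\qquad
f(\epsilon,\delta) \to 0 \text{ as } \epsilon,\delta \to 0.
```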

The definition of uniform equilibrium above assumes
that $\epsilon$ and $\delta$ are independent - i.e., the equilibrium
condition holds irrespective of how $\delta$ varies with $\epsilon$, as long
as they both tend to 0. In some of our results, we will
be concerned with the situation where $\delta$ is a function of
$\epsilon$ such that $\delta \to 0$ when $\epsilon \to 0$. We will abuse notation
by referring to the corresponding notion of equilibrium,
where the limit is now taken only as $\epsilon \to 0$, also as a
uniform equilibrium.

We say that a payoff pair $(u, v)$ is a uniform equilibrium
payoff if there is a uniform equilibrium $(M, N)$
such that $u_1(M, N) \to u$ and $u_2(M, N) \to v$ in the
discounted game when $\epsilon, \delta \to 0$.

The above equilibrium concepts are defined for pure 
strategy NEs, but the definitions extend easily to mixed 
strategy NEs. 

All the definitions above can be generalized easily to
$N$-player games for $N > 2$, and indeed the results of
Section 5 all hold for $N$-player games as well.

4 The Factoring Game 

In our formulation of the Factoring game, the win- 
ning player receives a payoff of 2 (before discounting) 
and the losing player receives a payoff of 1. The precise
values of these payoffs are not important for our main
results.

The $(\epsilon, \delta)$-discounted time version of the Factoring
game is defined in the usual way. In our presentation
here, we choose $\delta = \epsilon^c$, for some constant $c > 1$. The
Factoring game is naturally asymmetric. First, it is
sequential: Alice chooses a number, and then Bob acts
based on knowledge of Alice's number. Also, the natural
application of the Factoring game is to cryptography,
with Alice using a cryptosystem and Bob trying to break
it. In this context, by the polynomial-time Church-Turing
thesis, the computational model Bob uses is at
most polynomially faster than that of Alice.



Note that analogues of our results also go through for
other dependences of $\delta$ on $\epsilon$. The choice we make is
partly intended to illustrate that our model can capture 
one of the typical assumptions of complexity-theoretic 
cryptography. 
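
A back-of-the-envelope calculation, offered as a sketch and not as part of the formal argument, indicates why the choice $\delta = \epsilon^c$ gives Bob a polynomially longer time horizon than Alice. It uses only Bernoulli's inequality, $(1-x)^T \geq 1 - xT$ for $0 \leq x \leq 1$ and integer $T \geq 1$.

```latex
% Bob, discounted at rate \delta = \epsilon^c, running for T steps:
(1-\delta)^{T} \;\ge\; 1 - \delta T
\quad\Longrightarrow\quad
\text{Bob's discount loss is } o(1) \text{ whenever } T = o(1/\delta) = o(\epsilon^{-c}).

% Alice, discounted at rate \epsilon, running for T steps:
(1-\epsilon)^{T} \;\ge\; 1 - \epsilon T
\quad\Longrightarrow\quad
\text{Alice's discount loss is } o(1) \text{ whenever } T = o(1/\epsilon).
```

In other words, Bob can afford a running time larger than Alice's by a polynomial of degree $c$ before discounting noticeably reduces his payoff.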

We first show that if Factoring is easy in the worst 
case, then every uniform NE of the discounted game 
yields payoff 2 to Bob. 

Theorem 1 If for all linear-time samplable distributions
$D$, Factoring can be solved in probabilistic polynomial
time with success probability $1 - o(1)$ over $D$,
then for all sufficiently large $c$, the $(\epsilon, \epsilon^c)$-discounted
version of the Factoring game has a uniform Nash equilibrium
with payoff $(1, 2)$, and $(1, 2)$ is the only uniform
equilibrium payoff.

This result follows from the following lemma, which 
gives a tighter connection between the feasibility of 
Factoring and the uniform equilibrium payoffs of the 
discounted game. 

Lemma 2 If, for all linear-time samplable distributions
$D$, Factoring can be solved in probabilistic time $o(n^c)$
with success probability $1 - o(1)$ over $D$, then there is a
uniform Nash equilibrium of the $(\epsilon, \epsilon^c)$-discounted
version of the Factoring game yielding a payoff of $(1, 2)$.
Moreover, if $c > 1$, then every uniform equilibrium
yields payoff $(1, 2)$.

Proof. We first show the existence of the claimed 
uniform equilibrium giving a payoff of (1, 2), and then 
show that this is the only uniform equilibrium payoff 
achievable. 

The following pair of probabilistic machines (M, N) 
gives a pure-strategy uniform equilibrium with payoff 
(1,2). M simply outputs the number 2. N uses the 
trivial deterministic algorithm for Factoring running in 
exponential time to find a prime factorization for the 
number produced by M. 

As $\epsilon \to 0$, the payoff for this pair of strategies
tends to $(1, 2)$. We now show that $(M, N)$ is a uniform NE
for the game.
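The strategy pair $(M, N)$ can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the formal Turing-machine definition: loop iterations stand in for machine steps, the function names are ours, the undiscounted payoffs are taken to be (1, 2) when Bob factors and (2, 1) otherwise, and Alice and Bob are discounted at rates `eps` and `eps**c` respectively.

```python
def trial_division(n):
    """Return (prime_factors, steps) for n >= 2; loop iterations count as steps."""
    factors, steps, d = [], 0, 2
    while d * d <= n:
        steps += 1
        if n % d == 0:
            factors.append(d)
            n //= d
        else:
            d += 1
    if n > 1:
        factors.append(n)
    return factors, steps


def discounted_payoffs(eps, c, alice_steps, bob_steps, bob_factored):
    """Payoffs (Alice, Bob), each discounted by that player's own running time."""
    delta = eps ** c
    base_alice, base_bob = (1, 2) if bob_factored else (2, 1)
    return (base_alice * (1 - eps) ** alice_steps,
            base_bob * (1 - delta) ** bob_steps)


# M always outputs 2, so N's running time is a constant independent of eps.
factors, bob_steps = trial_division(2)
for eps in (1e-1, 1e-2, 1e-4):
    print(eps, discounted_payoffs(eps, 2, alice_steps=1,
                                  bob_steps=bob_steps + 1,
                                  bob_factored=(factors == [2])))
```

Because both running times are constants, the printed payoffs approach (1, 2) as `eps` shrinks, even though trial division is exponential in the input length in general.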

Since the payoff for Bob is bounded above by 2, irrespective of
what it does, it's clear that the advantage it can gain from
playing a different strategy tends to zero as $\epsilon$ tends
to zero. We still need to show that Alice can't do any better
in the limit.

Let $S$ be any (mixed) strategy for Alice - $S$ is a probability
distribution over probabilistic TMs. Whenever $S$ outputs a
number, Alice gets payoff at most 1, since Bob factors the
number. When $S$ does not output a number, Alice gets payoff 0;
thus, in either case, Alice's payoff is at most 1. This shows
that Alice can't do better than playing $M$.

Showing that $(1, 2)$ is the only uniform equilibrium payoff
possible is more involved. For the purpose of contradiction,
let $(a, b)$ be a uniform equilibrium payoff, where either
$a \neq 1$ or $b \neq 2$. We derive a contradiction.

We first consider the case $a \neq 1$. It cannot be the case
that $a < 1$, since Alice can always get payoff at least 1 in
the limit by just outputting 1, irrespective of what Bob does.
Thus it must be the case that $1 < a \leq 2$.

Now we show that $b = 2$. Let $(S, T)$ be a uniform NE with
payoff $(a, b)$. Let $\gamma(\epsilon)$ be the probability that
$S$ outputs a number of length at most $1/\epsilon$, where the
probability is over the randomness of choosing a strategy, as
well as the randomness in playing one (since a pure strategy is
a probabilistic TM). We show that $\gamma(\epsilon) \to 1$ as
$\epsilon \to 0$. For the sake of contradiction, suppose that
the limit infimum of $\gamma(\epsilon)$ is less than
$\alpha < 1$. This means that we can choose $\epsilon$
arbitrarily small for which $S$ outputs a number of length at
least $1/\epsilon$ with probability at least $1 - \alpha$.
Conditioned on outputting such a number, the payoff of Alice is
at most $2(1 - \epsilon)^{1/\epsilon}$, which tends to
$2/e < 1$ as $\epsilon \to 0$ (here $e$ is the base of the
natural logarithm). From the previous paragraph, we know that
Alice gets payoff at least 1 from playing $S$; hence, by an
averaging argument, we can choose $\epsilon$ arbitrarily small
for which there is a probability $\beta$ bounded away from 0
that Alice outputs a number of length at most $1/\epsilon$ and
gets a payoff greater than 1. We show that in this case, Bob
can improve its payoff by a non-trivial amount by playing a
different strategy $T'$.
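The bound on Alice's payoff for long outputs can be checked numerically. The sketch below only evaluates the expression $2(1 - \epsilon)^{1/\epsilon}$ (writing $1/\epsilon$ bits takes at least $1/\epsilon$ steps); the function name is ours and floating-point arithmetic is assumed to be accurate enough at these scales.

```python
import math


def long_output_payoff_bound(eps):
    """Upper bound 2 * (1 - eps)**(1/eps) on Alice's payoff for >= 1/eps output bits."""
    return 2 * (1 - eps) ** (1 / eps)


for eps in (0.1, 0.01, 0.001):
    print(eps, long_output_payoff_bound(eps))
print("limit 2/e =", 2 / math.e)
```

The values increase toward $2/e \approx 0.736$ and stay below 1, which is the payoff Alice can always secure with a short output.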

When defining $T'$, we use the assumption that Factoring is
easy on average for all linear-time samplable distributions
(note that this assumption was not used in the argument that
there's a uniform equilibrium with payoff $(1, 2)$). Consider
the linear-time samplable distribution $D$ on inputs of length
$\lceil 1/\epsilon \rceil$ defined as follows: Simulate $S$
independently $k/\beta$ times for $1/\epsilon$ computation
steps (where $k$ is a constant to be decided later), and output
the first number produced by $S$ of length at most
$\lceil 1/\epsilon \rceil$, padded up to length
$\lceil 1/\epsilon \rceil$, outputting an arbitrary number of
that length if all the runs of $S$ give numbers that are too
long. Clearly $D$ is linear-time samplable. Here we use the
fact that Factoring is paddable to any given length (padding
here just involves multiplication by a power of two). There is
some algorithm $A$ that works with success probability
$1 - o(1)$ over $D$, by assumption.
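The padding step can be made concrete. The following is a minimal sketch, assuming numbers are Python integers whose length is their bit length; the helper names are ours. Padding multiplies by a power of two, and a factorization of the padded number yields one of the original number once the padding factors of 2 are dropped.

```python
def pad_to_length(n, length):
    """Multiply n by a power of two so that the result has exactly `length` bits."""
    shift = length - n.bit_length()
    if shift < 0:
        raise ValueError("n is already longer than the target length")
    return n << shift, shift


def unpad_factorization(padded_factors, shift):
    """Remove the `shift` factors of 2 that padding introduced."""
    factors = sorted(padded_factors)
    if factors[:shift] != [2] * shift:
        raise ValueError("factor list does not contain the padding factors")
    return factors[shift:]


padded, shift = pad_to_length(15, 8)            # 15 = 3 * 5 has 4 bits
print(padded, shift)                            # 15 * 2**4
print(unpad_factorization([2, 2, 2, 2, 3, 5], shift))
```

Keeping track of `shift` matters when the original number is itself even, since only the padding factors of 2 should be removed.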

Consider the following strategy $T'$ for Bob: it looks at the
number output by Alice. If this number is at most $1/\epsilon$
bits long, it applies $A$ to this number. If the number is
longer, it plays strategy $T$. The process of looking at the
number and deciding what to do based on its length takes time
$O(1/\epsilon)$, but if $c > 1$, then
$(1 - \epsilon^c)^{O(1/\epsilon)} \to 1$ when $\epsilon \to 0$,
and hence this additive term incurs a negligible discount for
Bob. Conditioned on Alice outputting a number that's at least
$1/\epsilon$ bits long, Bob's payoff is the same in the limit
when playing strategy $T'$ as when playing strategy $T$. In the
other case, Bob gets a payoff of at least $2(1 - e^{-k})$ in
the limit (since he successfully factors while using
$o(1/\delta)$ time), which for large enough $k$ is strictly
better than what he obtained when playing strategy $T$, given
our assumption that Alice had a probability bounded away from 0
of outputting a number at most $1/\epsilon$ bits long and
getting a payoff greater than 1 (which would imply Bob got a
payoff less than 2). This is a contradiction to $(S, T)$ being
a uniform NE.
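The role of the condition $c > 1$ in this step can be seen numerically. The sketch below evaluates the discount Bob incurs for spending $k/\epsilon$ steps when his rate is $\epsilon^c$; the function name and the constant `k` are illustrative assumptions standing in for the hidden constant in $O(1/\epsilon)$.

```python
import math


def inspection_discount(eps, c, k=1.0):
    """Discount factor (1 - eps**c)**(k/eps) for k/eps steps at rate eps**c."""
    return (1 - eps ** c) ** (k / eps)


for eps in (1e-1, 1e-2, 1e-4):
    print(eps, inspection_discount(eps, c=2), inspection_discount(eps, c=1))
print("1/e =", 1 / math.e)
```

For `c=2` the factor tends to 1, so reading the input is essentially free for Bob; for `c=1` it tends to $1/e$ instead, which is why the uniqueness argument is stated for $c > 1$.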

Thus, we get that $\gamma(\epsilon) \to 1$ as $\epsilon \to 0$.
But then the strategy of Bob which simply applies the $o(n^c)$
factoring algorithm to the number output by Alice gets a payoff
of 2 in the limit. This implies that $b = 2$.

If $a \geq 1$ and $b = 2$, it must be the case that
$(a, b) = (1, 2)$ for the uniform NE $(S, T)$, since the total
expected payoff of any pair of strategies in this game is
bounded above by 3. □

Next, we show what is essentially a converse: if Factoring is
hard on average, then there is a uniform NE for the discounted
game with payoff $(2, 1)$.

Theorem 3 Suppose there is a linear-time samplable distribution
$D$ for which there is no probabilistic polynomial time
algorithm correctly factoring with success probability
$\Omega(1)$ over $D$ on inputs of length $n$ for infinitely
many $n$. Then for every constant $c \geq 1$, there is a
uniform NE for the $(\epsilon, \epsilon^c)$-discounted version
of the Factoring game with payoff $(2, 1)$.

The key to the proof of Theorem 3 is in the following Lemma 4
which, similar to above, makes a stronger connection between
$c$ and the running time of a factoring algorithm. The uniform
NE which we show to exist is a simple one where Alice plays a
random number of length approximately $1/\epsilon$ and Bob
halts immediately without output. We show that any deviating
strategy for Bob which gets him an improved payoff in the limit
can be transformed into a probabilistic polynomial-time
algorithm which factors well on average.

Lemma 4 Suppose there is no algorithm for factoring running in
time $n^c \,\mathrm{polylog}(n)$ for large enough input length
$n$, and succeeding on an $\Omega(1)$ fraction of inputs for
infinitely many input lengths $n$. Then there is a uniform NE
for the $(\epsilon, \epsilon^c)$-discounted version of the
Factoring game with payoff $(2, 1)$.

Proof. The following pair of strategies $(M, N)$ is a uniform
NE. $M$ selects a number of length
$n(\epsilon) = \lceil 1/\epsilon \rceil \lceil 1/\log(\lceil 1/\epsilon \rceil) \rceil$
at random and outputs the number. $N$ halts immediately without
output.

First we show that this gives payoff $(2, 1)$. It's clear that
the payoff for $N$ is 1 since it halts without output.
Therefore the undiscounted payoff for $M$ is 2. We show that
the discounting makes a negligible difference to this, since
$M$ doesn't need to spend too much time generating a random
number of length $n(\epsilon)$. Specifically, given the number
$\lceil 1/\epsilon \rceil$ on its input tape, $M$ computes
$n(\epsilon)$ in unary and stores it on a separate tape - this
can be done in time $O(n(\epsilon))$. It then generates a
random number on the output tape, using the computed value of
$n(\epsilon)$ to ensure the number is of the right length. The
total time taken by $M$ is
$O(n(\epsilon)) = O(1/(\epsilon \log(1/\epsilon)))$, and the
discounting due to this is $(1 - \epsilon)^{O(n(\epsilon))}$,
which is 1 in the limit as $\epsilon \to 0$.
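A quick numerical look at this discount may help. The sketch assumes an output length of $\lceil (1/\epsilon)/\log_2(1/\epsilon) \rceil$, chosen only to match the stated bound $O(1/(\epsilon \log(1/\epsilon)))$, and a hypothetical constant `steps_per_bit` for the hidden constant in the running time; neither is taken from the formal construction.

```python
import math


def output_length(eps):
    """Assumed length of Alice's number, of order 1 / (eps * log(1/eps))."""
    inv = math.ceil(1 / eps)
    return math.ceil(inv / math.log2(inv))


def generation_discount(eps, steps_per_bit=1):
    """Discount (1 - eps)**t for t = steps_per_bit * output_length(eps) steps."""
    return (1 - eps) ** (steps_per_bit * output_length(eps))


for eps in (1e-3, 1e-6, 1e-12):
    print(eps, output_length(eps), generation_discount(eps))
```

The discount behaves like $e^{-1/\log(1/\epsilon)}$, so it does tend to 1, but only slowly; the logarithmic factor in the length is exactly what makes the limit 1 rather than a constant below 1.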

Next we show (M, N) is a uniform NE. Alice has 
payoff bounded above by 2 for any strategy it plays, so 
clearly it cannot do better with a different strategy S. 
The bulk of the work is showing that Bob cannot do 
better. 

Suppose, on the contrary, that there is a strategy $T$ for Bob
such that the strategy pair $(M, T)$ yields payoff at least
$1 + \gamma$ for Bob for arbitrarily small $\epsilon$, where
$\gamma > 0$. We show how to extract from $T$ an algorithm that
factors efficiently on average on infinitely many input
lengths.

Choose an infinite sequence $\epsilon_1, \epsilon_2, \ldots$
such that for each $i$, $1 \leq i < \infty$, the strategy pair
$(M, T)$ yields payoff at least $1 + \gamma$ for Bob in the
$(\epsilon_i, \epsilon_i^c)$-discounted game, and
$n(\epsilon_i) > n(\epsilon_{i-1})$. Such a sequence exists by
the assumption that $(M, N)$ is not a uniform NE.

We show that there must exist a pure strategy $N$ for Bob such
that there is an infinite set $I$ for which $(M, N)$ yields
payoff at least $1 + \gamma/2$ for Bob for all $\epsilon_i$
such that $i \in I$. This argument takes advantage of the fact
that the Factoring game has payoffs bounded above by 2. By a
Markov argument, it must be the case for each
$i \in \mathbb{N}$ that the pure strategies in the support of
$T$ which yield payoff at least $1 + \gamma/2$ must have
probability weight at least $\gamma/2$. Now, if each pure
strategy only yields payoff at least $1 + \gamma/2$ finitely
often, then we can choose $i$ large enough so that the pure
strategies yielding payoff at least $1 + \gamma/2$ in the
$(\epsilon_i, \epsilon_i^c)$-discounted game have probability
weight less than $\gamma/2$ in the support of the mixed
strategy $T$, which is a contradiction.

Let $B = \{n(\epsilon_i) : i \in I\}$. $B$ is an infinite set,
by assumption.

We use $N$ to define a probabilistic algorithm $A$ for solving
Factoring well on average on all large enough input lengths in
$B$, contradicting the assumption of the theorem. Given a
number $x$ of length $n$, $A$ simply runs $N$ on $x$
$\log(n)$ times independently, halting each run after time
$n^c \lceil \log(n)^{c+2} \rceil$. If any of these runs outputs
numbers $y_1$ and $y_2$ such that $y_1 \cdot y_2 = x$, $A$
outputs these numbers; otherwise it outputs nothing. The
running time of $A$ is $O(n^c \log(n)^{c+3})$. We prove that
for at least an $\Omega(1)$ fraction of strings of length $n$
for infinitely many $n$, $A$ factors correctly with probability
$1 - o(1)$.
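The algorithm $A$ can be sketched in Python as a wrapper around Bob's strategy. Everything about the interface is an assumption made for illustration: `strategy(x, rng, budget)` is a hypothetical callable standing in for one run of $N$ cut off after `budget` steps, returning a pair of integers or `None`; `flaky_strategy` is a toy stand-in, not a real factoring strategy; and logarithms are taken base 2.

```python
import math
import random


def amplify(strategy, x, c=1, rng=None):
    """Run `strategy` about log(n) times with a step budget; return a verified pair."""
    if x < 2:
        raise ValueError("x must be at least 2")
    rng = rng or random.Random(0)
    n = x.bit_length()
    runs = max(1, math.ceil(math.log2(n)))
    budget = n ** c * math.ceil(math.log2(n) ** (c + 2))
    for _ in range(runs):
        result = strategy(x, rng, budget)
        if result is not None:
            y1, y2 = result
            if y1 * y2 == x:          # A only outputs answers it has checked
                return y1, y2
    return None


def flaky_strategy(x, rng, budget):
    """Toy stand-in for N: gives up half the time, else trial-divides within budget."""
    if rng.random() < 0.5:
        return None
    for d in range(2, min(budget, x)):
        if x % d == 0:
            return d, x // d
    return None


print(amplify(flaky_strategy, 91))
```

Since every candidate answer is verified by multiplication, repetition can only raise the success probability: if a single run succeeds with constant probability $p$, then $\log n$ independent runs fail with probability $(1 - p)^{\log n} = o(1)$.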

The idea is to analyze the payoff for Bob from the strategy
$(M, N)$, and show that an expected payoff greater than 1 means
that a significant fraction of computation paths must halt
quickly and factor correctly. Given a number $x$ of length
$n(\epsilon) \in B$ and a computation path $z$ of $N$ when
given $x$, let $I_{xz} = 1$ if path $z$ terminates in a correct
factoring of $x$ and 0 otherwise, let $t_{xz}$ be the time
taken along path $z$, and let $p_{xz}$ be the probability of
taking path $z$. We have that, for any $x$,
$\sum_z p_{xz} = 1$. Let
$f(x) = \sum_z (1 + I_{xz}) \, p_{xz} \, (1 - \delta)^{t_{xz}}$,
where $\delta = \epsilon^c$. Then the payoff of Bob is
$\sum_x f(x) / 2^{n(\epsilon)}$. By assumption, this quantity
is at least $1 + \gamma/2$. By a Markov argument, this implies
that for at least a $\gamma/4$ fraction of strings $x$ of
length $n(\epsilon)$, $f(x) \geq 1 + \gamma/4$.

Fix any such $x$. We classify the computation paths $z$ for the
computation of $N$ on $x$ into three classes. The first is the
set of $z$ for which $I_{xz} = 0$. This set contributes at most
$\sum_z p_{xz} (1 - \delta)^{t_{xz}} \leq \sum_z p_{xz} \leq 1$
to $f(x)$. The next class is the set of $z$ for which
$I_{xz} = 1$ and $t_{xz} \geq 2\log(1/\delta)/\delta$. This set
contributes at most
$\sum_z 2 p_{xz} (1 - \delta)^{2\log(1/\delta)/\delta} \leq \sum_z 2 p_{xz} \delta \leq 2\delta = o(1)$
to $f(x)$ (here the $o(1)$ refers to dependence on
$n(\epsilon)$ as $\epsilon \to 0$). Thus we have that
$\sum_{z \in Z} p_{xz} \geq \gamma/4 - o(1)$, where $Z$ is the
set of $z$ for which $I_{xz} = 1$ and
$t_{xz} < 2\log(1/\delta)/\delta$.
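The estimate for the slow paths rests on the inequality $(1 - \delta)^{2\log(1/\delta)/\delta} \leq \delta^2 \leq \delta$, which follows from $1 - \delta \leq e^{-\delta}$. The sketch below checks it numerically, assuming the natural logarithm (a larger base-2 logarithm only makes the left side smaller); the function name is ours.

```python
import math


def slow_path_discount(delta):
    """Discount (1 - delta)**t at the time threshold t = 2 * ln(1/delta) / delta."""
    t = 2 * math.log(1 / delta) / delta
    return (1 - delta) ** t


for delta in (0.5, 0.1, 0.01, 1e-4):
    print(delta, slow_path_discount(delta), delta ** 2)
```

Any path that runs past the threshold therefore contributes at most a $\delta$ fraction of its undiscounted payoff, which is why only the fast, correct paths in $Z$ can account for Bob's advantage.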

This means that with probability at least $\gamma/4$ over
strings $x$ of size $n(\epsilon) \in B$, $N$ halts in time at
most $2\log(1/\delta)/\delta$ and outputs factors of $x$ with
probability at least $\gamma/4 - o(1)$. This implies that for
all large enough $n \in B$, with probability at least
$\gamma/4 - o(1)$ over numbers of size $n$, $N$ halts in time
at most $n^c \lceil \log(n)^{c+2} \rceil$ and factors $x$ with
probability at least $\gamma/4 - o(1)$ (we're simply upper
bounding the time as a function of $n$ rather than of
$\delta$).

Since $A$ amplifies the success probability of $N$ by running
it $\log(n)$ times independently, the success probability of
$A$ is at least $1 - o(1)$ on an $\Omega(1)$ fraction of
inputs, for infinitely many input lengths. □

Essentially the same proof gives a more general version of
Lemma 4 - if there is some linear-time samplable distribution
$D$ such that no probabilistic algorithm running in time
$n^c \,\mathrm{polylog}(n)$ achieves an $\Omega(1)$ success
probability for Factoring over $D$, then there is a uniform
Nash equilibrium for the $(\epsilon, \epsilon^c)$-discounted
Factoring game achieving a limit payoff of $(2, 1)$. The only
difference is that $M$ plays a random number selected according
to $D$, and we argue with respect to this distribution rather
than with respect to the uniform distribution when defining the
factoring algorithm $A$. Theorem 3 follows immediately from
this more general version.

Unlike in the case of Lemma 2, this is not the only uniform
Nash equilibrium when Factoring is hard. Indeed, an examination
of the proof of Lemma 2 shows that we did not actually use the
assumption when showing there was a uniform NE with payoff
$(1, 2)$; the assumption was only used to prove the second part
of the lemma. Thus, even when Factoring is hard, there is a
uniform NE with payoff $(1, 2)$.

However, an important point to note is that the discounted
Factoring game is a sequential game, where Alice plays first.
Thus, even though there might be a uniform NE with payoff
$(1, 2)$, Alice can control which Nash equilibrium is reached,
and it is natural for it to select the equilibrium giving it a
higher payoff. The key question in the discounted Factoring
game is whether there exists a uniform NE giving Alice a payoff
greater than 1 - Lemma 2 shows that when Factoring is easy,
there isn't, and Lemma 4 shows that when Factoring is hard on
average, there is. This is somewhat related to the notion of
subgame-perfect equilibria in traditional game theory [26].
It's an interesting challenge to define an appropriate notion
of subgame-perfection for our model which could also be used in
a variation of our model where both Alice and Bob are
discounted by the total time taken by the two of them.

If one interprets Alice getting a payoff higher than 1 
as Player 1 "winning" the game, this result is in close 
accordance with intuition. Alice wins the game if and 
only if Factoring is hard. In practice, Factoring is be- 
lieved to be hard, and therefore in practice, we expect 
Alice to win the game, and not Bob as traditional game 
theory would predict. 

The uniform equilibrium in the statement of Lemma 2 yielding a
payoff of $(1, 2)$ is in fact also a strong uniform equilibrium
- this follows easily from the proof. Can Alice hope for a
strong uniform equilibrium yielding it a payoff of 2 in the
case that Factoring is hard? The answer is no.

Theorem 5 Consider the $(\epsilon, \delta)$-discounted version
of Factoring, where $\delta = o(\epsilon)$. Let $(S, T)$ be any
strong uniform NE of this game. Then the payoff pair
corresponding to $(S, T)$ is $(1, 2)$.

Proof. The proof is very similar to the proof of the second
part of Lemma 2, except that we can no longer use the
assumption that Factoring is in polynomial time. But we can use
an alternate strategy $N_\epsilon$ for Bob which plays the role
of the factoring algorithm in the proof of Lemma 2.

$N_\epsilon$ simply implements a look-up table, which stores
the numbers which $S$ may output, along with their factors.
$N_\epsilon$ need only store numbers of length $1/\epsilon$,
together with their factors. The key is that just by encoding
the look-up table in its state machine, $N_\epsilon$ can find
the factors of the number output by $S$ in time
$O(1/\epsilon)$, and since $\delta = o(\epsilon)$, this means
that the discount factor is 1 in the limit. The rest of the
argument is the same as in the proof of the second part of
Lemma 2. □
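The look-up-table strategy trades description size for running time, which a small sketch makes tangible. This is a simplification under our own assumptions: the table is a Python dictionary rather than a finite state control, it stores one nontrivial factor pair per composite instead of a full factorization, and `max_bits` plays the role of $1/\epsilon$.

```python
def build_lookup_table(max_bits):
    """Precompute a nontrivial factor pair for every composite below 2**max_bits."""
    table = {}
    limit = 1 << max_bits
    for a in range(2, limit):
        for b in range(a, (limit - 1) // a + 1):
            table.setdefault(a * b, (a, b))
    return table


table = build_lookup_table(6)
print(len(table), table[35], 37 in table)
```

Answering a query is a single look-up, but the table has on the order of $2^{1/\epsilon}$ entries, so the strategy must be rebuilt, and grows exponentially, as $\epsilon$ shrinks.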

Of course the dependence of the strategy of Bob on $\epsilon$
is essential, since we know that when Factoring is hard on
average there is a uniform equilibrium yielding Alice a payoff
of 2 in the limit. Moreover, the proof illustrates why the
notion of a strong uniform NE might be too strong an
equilibrium concept - Bob can push Alice's limit payoff down to
1, but the proof involves it playing strategies whose sizes
grow exponentially in $1/\epsilon$! For small values of
$\epsilon$, this is clearly infeasible.

The issue here is that there is a tradeoff between hard- 
ware and time. Computations can be made very effi- 
cient by exponentially increasing hardware, but in the 
physical world, both hardware and time are costly. Our 
model explicitly captures the idea of time being costly 
through discounting, but the expense of hardware is 
captured implicitly in the uniform equilibrium concept. 

There are other ways of defining equilibrium concepts which can
capture the cost of hardware in a more explicit manner. For
instance, we could define an $f(\epsilon, \delta)$-resilient
uniform NE as a uniform NE where no player gains in the limit
by playing a pure strategy whose size is bounded by
$f(\epsilon, \delta)$. Since a pure strategy is just a
probabilistic Turing machine, "size" has a natural
representation - it's the number of bits required to explicitly
present the state space, transition function and alphabet of
the Turing machine. A uniform NE as we define it is an
$O(1)$-resilient uniform NE, while strong uniform NE are
$f(\epsilon, \delta)$-resilient uniform NE for $f$ arbitrarily
large.

Now let us consider $f(\epsilon, \delta)$-resilient NE where
$\delta$ is polynomially bounded in $\epsilon$, and $f$ is
polynomially bounded in $1/\epsilon$. By using essentially the
proof of Lemma 4, as well as the fact that a probabilistic
Turing machine of size $K$ and operating in time $T$ can be
simulated by a probabilistic Boolean circuit of size
$O((K + T)^2)$, we get that there is an
$f(\epsilon, \delta)$-resilient uniform NE giving Alice a
payoff of 2 in the limit, unless Factoring can be solved
correctly by polynomial-size circuits on an $\Omega(1)$
fraction of inputs, for large enough input lengths.

Thus, not only is the difference between feasibility and
infeasibility of factoring captured by a difference in the
structure of equilibria for the Factoring game, but by a
natural modification of the notion of uniformity, we can
capture the difference between uniformity and non-uniformity!
This raises the possibility that there might be interesting
concrete complexity notions that might be captured by game
theory as well - we need not restrict attention to what happens
in the limit as $\epsilon \to 0$. Perhaps there are novel
notions of complexity that can be extracted from the
game-theoretic viewpoint, which give a better understanding of
the gap between finite complexity and asymptotic complexity?

We conclude this section by discussing our choice of 
parameters for the Factoring game, and showing that 
the results are robust to the choices we make. First, we 
examine the payoffs. Any choice of payoffs which are 
all positive and for which Bob gets strictly more (resp. 
Alice gets strictly less) if Bob succeeds in factoring will 
yield essentially equivalent results. 

Second, we discuss the discount factors. Our choice of
dependence of $\delta$ on $\epsilon$ was made to illustrate
nicely the correspondence between infeasibility and the
existence of equilibria yielding Alice a high payoff. But the
polynomiality of the dependence is not critical to our proofs -
in general, if $1/\delta = f(1/\epsilon)$ for some function
$f$, then our results hold when feasibility means solvability
in time $o(f(n))$ and infeasibility means unsolvability on
average in time slightly more than $f(n)$.

In the special case that $\delta = \epsilon$, we get that Alice
has a winning strategy under the natural assumption that
Factoring is not in quasi-linear time on average.

5 Properties of Discounted Time Games 

The most fundamental results in a theory of games 
of a given form concern existence of equilibria. Here 
we prove a couple of results of this form. The first 
result shows that the concept of uniform equilibrium 
for the discounted version of a finite game corresponds 
nicely to the concept of Nash equilibrium for the origi- 
nal game. The second result complements this by show- 
ing that discounted games might have equilibria that the 
original game does not possess. 

We show that any Nash equilibrium in a finite game 
G translates to a strong uniform Nash equilibrium yield- 
ing the same uniform payoff in the discounted version 
of G. 

Theorem 6 Let $G$ be a finite two-player game. Given any Nash
equilibrium $(S, T)$ of $G$, there is a strong uniform Nash
equilibrium $(S', T')$ of the discounted version of $G$ which
yields the same payoff in the limit as
$\epsilon, \delta \to 0$.

Proof. We assume that G is a finite two-player game in 
normal form. If G is sequential and given in extensive 
form, we just consider the image normal-form game, 
which is known to inherit its equilibria from the sequen- 
tial game. 

Let $(S, T)$ be a (possibly mixed-strategy) NE of $G$. We
define a strategy pair $(S', T')$ for the discounted version of
$G$, and argue that this is a strong uniform Nash equilibrium
for the discounted version, with the same payoffs for both
players in the limit. Given any pure strategy $s_i$ of a player
in $G$, choose in an arbitrary way a Turing machine $M_{s_i}$
which ignores its input and halts after outputting a
representation of $s_i$. If $S$ gives probability $p_i$ to
strategy $s_i$, then we give machine $M_{s_i}$ probability
$p_i$ in $S'$. $T'$ is defined in an analogous way given $T$.

The key point is that irrespective of the way the
representative machines for strategies are chosen, they are
guaranteed to halt in finite time. As $\delta$ and $\epsilon$
approach zero, the discount factors approach one, and hence the
payoff in the discounted game from playing $(S', T')$
approaches the payoff from playing $(S, T)$ in $G$.

It still remains to be shown that $(S', T')$ is an $\eta$-NE
for the discounted game, where $\eta \to 0$ when
$\epsilon, \delta \to 0$. This would imply that $(S', T')$ is a
strong uniform NE for the discounted game. We show that Player
1 cannot gain a significant advantage from playing a different
mixed strategy $S'_1$ - the analogous result holds for Player 2
as well.

Any mixed strategy $S'_1$ in the discounted game can be
transformed into a mixed strategy $S_1$ in $G$ - each pure
strategy is given the same probability of being played in $G$
as it has of being output by a probabilistic TM in the
discounted game (the probability weight of non-halting
computation paths is assigned to an arbitrary strategy in
$S_1$). Because of the discounting, the payoff that Player 1
can get by playing $S'_1$ in the discounted game is at most the
payoff that he can get by playing $S_1$ in $G$. But the payoff
by playing $S$ in $G$ is at least the payoff by playing $S_1$
in $G$, and the payoff by playing $S'$ in the discounted game
approaches the payoff by playing $S$ in $G$ as
$\epsilon, \delta \to 0$. This shows that the advantage of
playing $S'_1$ in the discounted game must tend to zero as
$\epsilon, \delta$ tend to zero, for an arbitrary $S'_1$,
implying that $(S', T')$ is a strong uniform NE for the
discounted game. □
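The convergence of payoffs used in this proof can be illustrated for the row player. The sketch assumes each representative machine for pure strategy $i$ halts after a fixed number of steps `halting_steps[i]` and that a player's payoff is discounted by that player's own running time; the payoff matrix, the mixed strategies and the step counts are arbitrary illustrative values, and the code checks only convergence, not the equilibrium property.

```python
def discounted_row_payoff(payoff, row_mix, col_mix, eps, halting_steps):
    """Row player's expected payoff when strategy i's machine halts in halting_steps[i]."""
    total = 0.0
    for i, p in enumerate(row_mix):
        discount = (1 - eps) ** halting_steps[i]
        for j, q in enumerate(col_mix):
            total += p * q * discount * payoff[i][j]
    return total


payoff = [[2, 0], [0, 1]]            # row player's payoffs in a 2x2 game
mix = (0.5, 0.5)
for eps in (1e-1, 1e-3, 1e-6, 0.0):
    print(eps, discounted_row_payoff(payoff, mix, mix, eps, halting_steps=(3, 7)))
```

Because the halting times are constants independent of `eps`, the discounted payoff approaches the undiscounted value (0.75 here) no matter how the representative machines were chosen.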

Consider the Largest Integer Game where both play- 
ers simultaneously play integers. The player playing the 
largest integer receives a payoff of 100 with each receiv- 
ing 50 if they play the same integer. This game has no 
Nash equilibrium or even an almost Nash equilibrium 
(Nash's theorem doesn't apply because the action space 
is not compact). 
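The failure of equilibrium is easy to see for pure strategies. In the sketch below the loser's payoff of 0 is our assumption (the description above only fixes the payoffs 100 and 50), and the function names are ours; the code covers pure profiles only, and the mixed-strategy case needs a separate argument.

```python
def payoffs(a, b):
    """Payoffs (player 1, player 2); the loser's payoff of 0 is an assumption."""
    if a > b:
        return (100, 0)
    if b > a:
        return (0, 100)
    return (50, 50)


def profitable_deviation(a, b):
    """Return a strictly improving unilateral deviation from the pure profile (a, b)."""
    top = max(a, b) + 1
    if payoffs(top, b)[0] > payoffs(a, b)[0]:
        return ("player 1", top)
    return ("player 2", top)


print(profitable_deviation(3, 3), profitable_deviation(5, 2))
```

Whatever integers are played, some player can strictly gain by naming a larger one, so no pure profile is stable.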

Next we show that almost-NEs exist in the discounted version,
not only for the Largest Integer game but for any countable
game with bounded payoffs. The basic idea of the proof is to
approximate the discounted countable game by a finite game, and
then reduce the existence of uniform equilibria in the
discounted countable game to the existence of NEs in the
corresponding finite game.

Theorem 7 Let $G$ be a two-player game with bounded payoffs
where both players have a countable number of actions. Then for
each $\epsilon, \delta > 0$, the $(\epsilon, \delta)$-discounted
time version of $G$ has an $(\epsilon + \delta)$-NE.

Proof. Let $G$ be as stated in the theorem, and let $K \geq 1$
be an upper bound on payoffs for $G$. Consider the
$(\epsilon, \delta)$-discounted time version of $G$. We show
how to approximate the discounted game by a finite game
$G_{\epsilon,\delta}$ and then use the existence of Nash
equilibria in the finite game to show the existence of
approximate Nash equilibria in the discounted game.

The finite game $G_{\epsilon,\delta}$ is the subgame of the
discounted game where the first player plays probabilistic
Turing machines of description size at most
$2^{2K^2/\epsilon^2}$, and the second player plays
probabilistic Turing machines of size at most
$2^{2K^2/\delta^2}$. By Nash's theorem, this game has a
mixed-strategy Nash equilibrium $(S_1, T_1)$. We show that
$(S_1, T_1)$ is an $(\epsilon + \delta)$-NE for the discounted
game.

We show that for any mixed strategy pair $(S, T)$ in the
discounted game, there is a mixed strategy pair $(S', T')$ in
$G_{\epsilon,\delta}$ such that
$u_2(S, T') \geq u_2(S, T) - \delta$ and
$u_1(S', T) \geq u_1(S, T) - \epsilon$. This implies that any
NE for $G_{\epsilon,\delta}$ is an $(\epsilon + \delta)$-NE for
the discounted game.

Let $(S, T)$ be a mixed strategy pair in the discounted game.
We show how to construct a strategy $T'$ in
$G_{\epsilon,\delta}$ for Player 2 such that
$u_2(S, T') \geq u_2(S, T) - \delta$. The corresponding result
for Player 1 follows by a symmetric argument.

The argument is a "probability-shifting" argument - we will
show how to transfer probability from probabilistic machines in
the support of $T$ with size more than $2^{2K^2/\delta^2}$ to
probabilistic machines with description size smaller than that
number without damaging the payoff of Player 2 too much.
Specifically, the payoff of Player 2 will not decrease by more
than $\delta$ conditional on that strategy being played, and
hence there will not be more than a $\delta$ decrease in total.

Let $N$ be a probabilistic machine of size more than
$2^{2K^2/\delta^2}$ which has non-zero weight in $T$. We define
a corresponding machine $N'$ of size at most
$2^{2K^2/\delta^2}$, and transfer all the probability weight of
$N$ to $N'$ in $T'$. Essentially, $N'$ will be
indistinguishable from $N$ relative to the discounting.

The key observation is that we don't need to take into account
computation paths in $N$ of length greater than
$K^2/\delta^2$, because the strategies output on such
computation paths are so radically discounted that we may as
well assume they yield zero payoff, without incurring too much
damage to the overall payoff. $N'$ behaves like $N$ "truncated"
to $K^2/\delta^2$ steps, outputting a strategy for $G$ if $N$
does within that time, and looping otherwise.

We cannot simply simulate $N$ using a universal machine and a
clock, since the simulation incurs too much of a time overhead
and does not preserve the payoff to within a small additive
term. Instead we simulate $N$ in hardware - this is much more
time efficient. Specifically, we're interested in the behavior
of $N$ only for the first $K^2/\delta^2$ time steps. We can
define a Turing machine $N'$ with description size at most
$2^{2K^2/\delta^2}$ which encodes the relevant behavior of $N$
entirely in its finite state control. This simulation incurs no
time overhead at all.

Now, we calculate the maximum damage to Player 2's payoff from
playing $N'$ instead of $N$. There is no damage to the payoff
from computation paths of $N$ which terminate within
$K^2/\delta^2$ steps. Thus the loss in payoff is bounded above
by $(1 - \delta)^{K^2/\delta^2} K$, which is at most $\delta$
if $K \geq 1$. This finishes the argument. □
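The final inequality can be sanity-checked numerically. The sketch evaluates the loss from discarding all computation paths longer than $K^2/\delta^2$ steps, each worth at most $K$ before discounting; the function name is ours and the grid of test values is arbitrary.

```python
def truncation_loss_bound(K, delta):
    """Bound K * (1 - delta)**(K**2 / delta**2) on the payoff lost by truncation."""
    return K * (1 - delta) ** (K ** 2 / delta ** 2)


for K in (1, 2, 10):
    for delta in (0.5, 0.1, 0.01):
        print(K, delta, truncation_loss_bound(K, delta))
```

Since $(1 - \delta)^{K^2/\delta^2} \leq e^{-K^2/\delta}$, the bound is far below $\delta$ for every $K \geq 1$; the truncation length is generous rather than tight.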

In case the payoffs of the game $G$ are computable, we get a
stronger version of Theorem 7 in that uniform equilibria are
guaranteed to exist.



Theorem 8 Let G be a two-player game where each 
player has a countable number of actions, and suppose 
the payoffs are bounded and computable. Then the dis- 
counted time version of G has a uniform equilibrium. 

Proof Sketch. The proof is similar to the proof of Theorem 7,
but we take advantage of the fact that payoffs are computable.
As in the proof of Theorem 7, we can define a finite truncated
version of the game such that the almost-Nash equilibria of the
truncated game are also almost-Nash equilibria of the
discounted game. In order to ensure uniformity, however, we
have to produce a fixed pair of strategies such that as
$\epsilon, \delta \to 0$, neither player can gain a non-zero
amount in the limit by using a different strategy.

The basic idea is to define a strategy pair $(M, N)$ such that
$M$ and $N$ deterministically compute an almost-Nash
equilibrium of the truncated game, with $M$ proceeding to play
the strategy of Player 1 in the computed almost-Nash
equilibrium, and $N$ proceeding to play the strategy of Player
2. There are two obstacles to this approach. The first is the
computational obstacle, but this can be circumvented since the
entries of the payoff matrix for the truncated game can be
estimated to any desired accuracy using sampling and the
computability of the payoffs of the original game, and then the
Lemke-Howson algorithm [28] can be used to find almost-equilibria
of the truncated game.

The second obstacle is that computing an almost-Nash
equilibrium of the truncated game incurs a substantial time
overhead, which already drives the payoffs of the two players
down before they play the strategies corresponding to the
almost-Nash equilibrium, not to mention the simulation overhead
from using a single machine ($M$ or $N$) to find an almost-Nash
equilibrium for all $\epsilon, \delta > 0$. This obstacle is
overcome using the idea of "miniaturization" - given discount
rates $\epsilon$ and $\delta$ respectively, the players pretend
that their discount rates are $\epsilon'$ and $\delta'$
instead, where $1/\epsilon'$ and $1/\delta'$ grow very slowly
as a function of $1/\epsilon$ and $1/\delta$. $\epsilon'$ and
$\delta'$ are chosen so that the players can compute an
almost-Nash equilibrium of the $(\epsilon', \delta')$-discounted
game quickly enough that their payoffs in the
$(\epsilon, \delta)$-discounted game are hardly affected by
this computation, and that playing the strategies for the
truncated game takes relatively little time as well. The point
is that this is still an $(\epsilon' + \delta')$-NE for the
discounted game, and that $\epsilon', \delta' \to 0$ as
$\epsilon, \delta \to 0$. Hence it is a uniform Nash
equilibrium. □

The bounded-payoff assumption in Theorems 7 and 8 is essential
for the conclusion to hold. Indeed, consider the two-player
game where Player 1 derives a payoff of $2^i$ from playing
integer $i$ and Player 2 a payoff of $2^j$ from playing integer
$j$. It is not hard to see that the discounted version of this
game does not even have almost-NEs.

Theorem 7 shows that the discounted version of the Largest
Integer Game does have almost-NEs. For the Largest Integer
game, in fact, there is a strong uniform equilibrium which
yields a payoff of 0 for both players, and every uniform
equilibrium gives payoff 0 to both players in the limit. This
is intuitive: the Largest Integer game is a game of
one-upmanship, where each player tries to outdo the other by
producing a larger number. In the process, they exhaust their
computational resources (or alternatively, end up spending an
inordinate amount of time) and end up with nothing.

In general, uniform equilibrium is a strong notion of
equilibrium, since there should be no gain in deviating
irrespective of how $\epsilon, \delta \to 0$. Suppose we know
more about the relationship of $\epsilon$ and $\delta$, say
that $\delta < \epsilon^2$, i.e., Player 2 always has more
computational power. In this case there are equilibria in which
Player 2 wins, say by outputting $2(1 - \epsilon)^{3/2}$ while
Player 1 outputs $(1 - \epsilon)^{3/2}$. This is again in
accordance with intuition - if the players are asymmetric, the
more patient or computationally stronger player should win this
game (the discount rate can be seen, depending on the
situation, as either an index of patience or of computational
power).